50,014 results • Page 2 of 1001
trying to use RSeQC for identifying strandness but I am getting error as `Could not retrieve index file` Code infer_experiment.py --i test-sorted.bam -r output.bed12 Output: ``` [E::idx_find_and_load] Could not retrieve index...file for 'test-sorted.bam' Reading reference gene model output.bed12 ... Done Loading SAM/BAM file ... Total 200000 usable reads
updated 5 days ago • Prawesh
Hello, I ran kallisto on my data and I am in the process of assigning gene names to the results. I tried to do this in 2 different ways but I get different results. The first way I tried is shown below using the t2g.py from https://github.com/pachterlab/kallisto-transcriptome-indices/releases: #Create the transcripts_to_genes file python t2g.py --use_version <homo_sapiens.grch38…
updated 5 days ago • bioinfo
If I want to see if my transcription factor chipseq correlate with a histone mark chipseq, is this a common practise: bamcompare to get the bigwig file with fold enrichment and then use bigwigsummary and plotcorrelation? But this method gave me very low correlation coefficient…
I have previously used the BioMart web portal to download FASTAs for the 3' UTRs of a list of Ensembl stable gene IDs. Typically I limit my output to "MANE Select" as I am trying to get just one
updated 6 days ago • RNAseqer
Hello everyone, I have annotation file like this ``` less -S Sars_cov_2.ASM985889v3.101.gtf | head -20 #!genome-build ASM985889v3 #!genome-version ASM985889v3 #!genome...protein_version "1"; ``` I have a reference genome : sequence.fasta and a bam file : ILS_W_V_558_S2_R1_001_val.bam that looks like this : ``` samtools view ILS_W_V_558_S2_R1_001_val.bam | head NB551648...Parse …
updated 6 days ago • Adyasha
data and not change anything else. I have been able to write some R code that can do this to files in STRUCTURE format as that format is amenable to import into R. I am wondering if it is possible to do something similar...directly to the VCF file or to all the .bed, .fam, .bim files generated from PLINK from a VCF. I have looked into VCFtools and BCFtools but the merge functions...and collate fu…
updated 6 days ago • ajbarrett98
datasets and I keep getting this error while trying to generate the Gene and Transcript counts files. ``` Dataset Error Report An error occurred while running the tool toolshed.g2.bx.psu.edu/repos/iuc/stringtie/stringtie
updated 6 days ago • trkfs
GT \ --out /data/small_CB_pN \ --sam-verbose 10000000 \ --vcf-verbose 100000 ``` (The .bam file comes from scRNA-seq data using a Parse Biosciences kit, hence the pN UMI tag.) When I run this command, I get a .single file and...empty .best and .sing2 files. I also get this message: terminate called after throwing an instance of 'std::bad_alloc' what(): std::bad_alloc I have read...th…
updated 6 days ago • eking28
degenerate ("4d") sites either from the individual genes (pre-alignment), or from the alignment files directly? It seems like this is fairly standard practice for a lot of phylogenomic analysis, going by the literature
updated 6 days ago • J.
of a plant species. I have completed the gene prediction using the Augustus pipeline. The output file is of format `.gff` . Now I want to perform the gene annotation by performing `BLAST` for which I need the coding sequences in...a `.fasta.` file. This is the method that I've thought of approaching. 2. Use the `perl` script `getAnnoFasta.pl` to get the `amino acid sequence...and later t…
updated 6 days ago • Vijith
command but I am getting empty seq_89.vcf.gz output Here is the command I used: Generate your bam file as usual samtools sort -@ 56 seq-89-highquality.bam -o seq-89-highquality.sorted.bam bcftools mpileup -Ou -f KT992094.1.fasta
updated 6 days ago • Ghada
Hey everyone, I am doing a lot of variant calling. So far, I have always used the Ensembl refgenomes with the "number only" nomenclature for the main chromosomes. My default workflow (very simplified) is: Map fastq to Ensembl refgenome -> call variants -> annotate variants with VEP. I prefer to use the VEP cache over gtf/gff files for annotation since this is recommended by …
updated 7 days ago • gernophil
flagstat -@ 20 $MAPPING/${i}.sorted.bam > $MAPPING/${i}.mapping_stats.txt", the output text file is empty. Please suggest a solution
updated 7 days ago • ramendra.sarma
Hello everybody, I wanted to align some files against the reference genome using the following script: files="Chionobathyscus_dewitti_12 Chionobathyscus_dewitti_14...bwa_db=GCA_943594065.1_fChiDew1_genomic.fna.gz (# reference genome) for sample in $files do echo $sample bwa mem -t 2 $bwa_db ${sample}.1.fq.gz ${sample}.2.fq.gz | samtools view -b | samtools sort --threa…
updated 7 days ago • Vahid
Hi, I am looking for a fasta file that contains mouse rRNA sequences, but I noticed that the links I searched on the internet point to some different
updated 7 days ago • octpus616
github.com/kevinblighe/PCAtools [2]: https://github.com/kevinblighe/PCAtools?tab=readme-ov-file#quick-start-deseq2 [3]: https://github.com/kevinblighe/PCAtools?tab=readme-ov-file#quick-start-gene-expression-omnibus
updated 7 days ago • BioinfGuru
Hi there, I'm working on a joint call-set of 47 VCFs which I will be merging with `GLNexus`. I've done this before but, for some reason, since I added 2 extra samples to the original 45 (47 in total), there have been a few issues. The original 45 samples are from the SGDP, called with `UnifiedCaller`; the 2 extra are archaic Neanderthal and Denisova, which have been called with the same pipe…
updated 8 days ago • Matteo Ungaro
longest transcript variant per gene. OrthoFinder provides a script for this but it only applies to files downloaded from Ensembl. Does anyone know a tool that can help me with this? Basically I have these files for each species...braker.aa braker.codingseq braker.gff3 ``` My protein file looks like this: ``` head -n 2000 braker.aa ``` ``` >g176.t1 MTKLTKRLELQMESSRLGLLRSHSRARSSKLASSQSKA…
updated 8 days ago • sansan_96
counts_matrix <- GetAssayData(data, assay='SCT', slot='counts') writeMM(counts_matrix, file=paste0(file='/lustre1/project/stg_00079/students/soniya/seurat/matrixraw.mtx')) # write dimensional reduction matrix...PCA) write.csv (SRT@reductions$pca@cell.embeddings, file='/lustre1/project/stg_00079/students/soniya/seurat/pca.csv', quote=F, row.names=F) libr…
updated 9 days ago • beginner123
OUTPUT_DIR/${base_name}_2_unpaired_trimmed.fastq.gz" # Check if trimmed files already exist trimmed_files_exist=true for file in $trimmed_paired1 $trimmed_unpaired1 $trimmed_paired2 $trimmed_unpaired2...logic here else echo "Trimmed files for $base_name already exist. Skipping." fi …
updated 9 days ago • melissachua90
100000000 -c 1 -endPlugin -runfork1 which seems to run fine. There are a couple errors in the log file. This one appears near the beginning, but it doesn't seem to stop the rest of the script from running: [SQLITE_ERROR] SQL error...number of low quality reads=19326 Timing process (sorting, collapsing, and writing TagCount to file). Process took 2855.360497 milliseconds. tagCntM…
updated 9 days ago • meck
From [this post][1], I'm still struggling with GBS. I know it's pretty particular about file format, and so I'm wondering if there's something wrong with my files that I'm not seeing. First, here are a few of my file names
updated 9 days ago • meck
Hello everyone, I am facing challenges with liftover of a VCF file from hg19 to hg38 using GATK because of 'I' and 'D' annotations representing insertions and deletions in the VCF file. Running...rejeceted_variants.vcf --RECOVER_SWAPPED_REF_ALT True Despite converting the VCF file to VCF version 4.2 using vcftools, I'm still having this issue. htsjdk.tribble.TribbleException: The provided…
updated 9 days ago • Omics data mining
Hello, I am using Practical Haplotype Graph v2.2.85.134 to build a pangenome graph using six diploid plant species (12 haplotypes). I was able to go through their [Build and Load module][1]. Then, I simulated 10 million WGS Illumina Novaseq paired end reads from two of my haplotypes (australasica_primary and fallglo_primary) and mapped it to the pangenome using their [Imputation module][2]. The i…
updated 9 days ago • beantkapoor16
better to align to both genomes separately, allowing multimapping for spike-in, and then filter BAM files to exclude reads present in both of them. But I am not sure if I am missing anything, so any input or advice is welcome. Thanks
I got the haps file and sample file after pre-phasing with Eagle2. After that, I tried to convert them to a VCF file using SHAPEIT4, but it keeps saying...that there is no index file. How can I easily convert to a VCF file? Thank you
updated 10 days ago • SeoGyun
Hi, I'm trying to use PRS-CSx which requires SNP, A1, A2, Beta/OR and P value/SE. I want to use the individual study beta's instead of the random/fixed effect. Is there a way to calculate the p value/se for the individual study betas from this information? Columns in the file: CHR Chromosome code BP Basepair position SNP SNP identifier A1 Effect allele A2 Non-effect allele …
updated 10 days ago • curious_butterfly
dependencies are properly addressed. However, I am encountering challenges with organizing the input files, particularly with respect to arranging the BAM files in the accompanying text file. As part of my analysis, I possess...BAM files corresponding to different species along with their respective GTF files. My primary concern lies in structuring...the input file, particularly regarding the ap…
updated 10 days ago • Lambodarswain316
I'm searching for a long pattern in my fastq file using `bbduk.sh` and `seal.sh`. Both can't find it, even though can `grep` it. ``` $ grep --color=always CGAGTACCCT 10_ID_mRNA_S1_L002_R1_001.fastq
updated 10 days ago • Assa Yeroslaviz
Hi all, I have tfam and tped files from dog data. I need to convert these to map and ped files for a program I will be using. I've used the code ```plink --dog --tfile...including other variations such as ```ls -1 *.tped | sed 's/\_*.tped//'`; do plink --dog --tfile ${file} --recode --out ${file}_ ; done```. However, I keep getting an output .ped file that's just a bunch of nucleotides in a …
updated 10 days ago • Samantha
study samples merged into one: 1000 WGS study samples + 2504 1000Genomes samples 2. Created a ".pop" file with "-" for study samples and one of the below listed ancestry for 1000Genome samples in the same order as in ".fam" file. 3. admixture...0.124 0.090 My question is how to assign the ancestry name to the output columns in the ".Q" file? Is this a sorted list of the 5 ancestry names from t…
updated 10 days ago • RT
longest ORF in that identified sequence? Identify all repeats in a sequence for all sequences in the FASTA, along with how many times each repeat occurs and which is the most frequent repeat.” The primary problem I think I have...is that I don’t know how to reference the sequences inside a FASTA file beyond what I have already, so my has_codon section of code isn’t working like I think it should…
updated 10 days ago • cput
ANNOVAR to annotate my dog tumor Illumina whole genome sequenced DNA reads. It generated 3 output files: 1. `exonic.variant.function` 2. `variant_function`; and 3. `.log` My exonic variant function file has many unknown sites. Is
updated 10 days ago • sainavyav22
this issue downloading metadata tsv reports from ENA Portal API, it only happens with very large files (over 100 MB): they are saved to TSV before they are fully downloaded, without showing any error or warning. If I try to download...the same file from the browser it's fully downloaded, but it's not what I'm looking for. This is the code: ```py projectID = "PRJNA43021" s = rq.session...True) w…
updated 10 days ago • Giulia
Hello everyone, I'm calculating the length of reads from a BAM file to create plots later on. I haven't filtered out secondary and/or supplementary alignments, and I can't understand why I…
updated 10 days ago • marco.barr
I am using the rMATS software and have sorted out all of its dependencies. Now I am having a problem with the input file, mainly how to arrange the BAM files in the txt file. I have BAM and GTF files from different species. There is confusion about b1 and b2 during...run file if I have 1 species and 1 GTF file, then how can I add species at run time? If I am wrong, then please guide me to make a correct...input file
updated 10 days ago • Lambodarswain316
Hi, I have some normalized BigWig files and now I want to convert these normalized BigWig files to a count matrix. Can anyone give me any advice? I would appreciate it
updated 10 days ago • feather-W
for that, I require their DNA sequencing reads. I believe I can obtain these reads from their CRAM files of the normal samples, so I downloaded a slice of the CRAM according to their instructions bin/score-client view --object...id 28358cf3-fba0-51a3-8b93-104bd5d48b23 --reference-file /home/victor/ref-fasta/GRCh38_full_analysis_set_plus_decoy_hla.fa --output-dir /media/victor/c1d5c312-b546-…
updated 11 days ago • Javier
Good morning, I aim to chop an already aligned BAM file based on different regions of a gene as follows: samtools view -b -o out.bam -L regions.bed origin.bam samtools sort -o out.sort.bam...out.bam samtools index Then I want to convert the `out.bam` into two unaligned FASTQ files (each member of the read pair parsed to one of the two files) using SamToFastq: …
updated 11 days ago • Lila M
working with two lanes of Hi-C reads (2 forward, 2 reverse). Initially, I was merging the raw FASTQ files before mapping, but I've since been mapping each set of forward and reverse reads independently with BWA-MEM2 and then...merging the BAM files. I've also tried using only a single lane of forward and reverse reads. 2. Initially, I deviated from the VGP pipeline and
updated 11 days ago • Winter
Hello, how do I import a FASTQ file from my local Windows computer into Fluent Terminal (WSL)
updated 11 days ago • oumo
base) I'm not sure how to interpret this - I assume that some internal python3 file is missing or not found. Possibly pycostat, given the SAT solver errors in installing and creating environments? conda...install -c conda-forge biopython # code to generate error /opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/conda_package_streaming
updated 11 days ago • kacollier
I have some very large WGS BCF files and I would like to extract just the first 8 columns, thus reducing to just a 'sites-only' VCF/BCF. Does BCFtools have a canned option...t%REF\t%ALT\t%QUAL\t%FILTER\t%INFO\n" but I'm finding t…
updated 11 days ago • Matthew
Hello, I'm annotating a vcf file using ensembl vep and gnomad v4 vcf file: vep --cache --offline --species homo_sapiens --assembly GRCh38 \ --input_file input.vcf \ --custom gnomad.vcf.b…
updated 11 days ago • asalimih
Hello all, Data: paired-end RNA-seq data. I have an issue with the featureCounts output: the number of Assigned reads is greater than the number HISAT reports as aligned concordantly exactly 1 time ``` From HISAT: aligned concordantly exactly 1 time is 48335140 From featureCounts summary: Assigned: 64074047 ``` The Assigned value is 1.32 times greater than the HISAT mapping result. It's weird that Assigned value is hig…
updated 11 days ago • Prawesh
Hi there, I am working on the de novo analysis of a bacterium. This bacterium is related to Bacillus paranthracis, which was identified using rMLST, TYGS, and BLAST analysis. The genome assembly file was then provided to the KEGG-KASS tool. Although the prokaryotic dataset was selected for analysis, the identified KO…
updated 11 days ago • mathavanbioinfo
Hello everyone, I'm currently working with VCF files of mutations from the TCGA dataset using the hg38 assembly. To further my analysis, I'm interested in comparing mutation
updated 11 days ago • elisheva
my current experience in IT financial system support got certification in bioinformatics and python/biopython
updated 12 days ago • shehab
fetus (the one inherited from the mother by the fetus). The final results I'm looking for are a bam file with the reads from the unique chromosome of the mother, another with the reads from the unique chromosome of the child
updated 12 days ago • njornet
Hey everyone, I need some help with SAMtools (v 1.3.1) and such. I have 2 files that I want to align, with the ultimate goal of understanding what percentage of the reference genome is covered by…
updated 12 days ago • Lemonhope